Implicit Filter Sparsification In Convolutional Neural Networks
We show implicit filter level sparsity manifests in convolutional neural
networks (CNNs) which employ Batch Normalization and ReLU activation, and are
trained with adaptive gradient descent techniques and L2 regularization or
weight decay. Through an extensive empirical study (Mehta et al., 2019) we
hypothesize the mechanism behind the sparsification process, and find
surprising links to certain filter sparsification heuristics proposed in
literature. Emergence of, and the subsequent pruning of selective features is
observed to be one of the contributing mechanisms, leading to feature sparsity
on par with or better than certain explicit sparsification / pruning approaches. In
this workshop article we summarize our findings, and point out corollaries of
selective-feature penalization which could also be employed as heuristics for
filter pruning.
Comment: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019
paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks,
Mehta et al." (arXiv:1811.12495)
Example-based learning for single-image super-resolution and JPEG artifact removal
This paper proposes a framework for single-image super-resolution and JPEG artifact removal. The underlying idea is to learn a map from input low-quality images (suitably preprocessed low-resolution or JPEG encoded images) to target high-quality images based on example pairs of input and output images. To retain the complexity of the resulting learning problem at a moderate level, a patch-based approach is taken such that kernel ridge regression (KRR) scans the input image with a small window (patch) and produces a patch-valued output for each output pixel location. These constitute a set of candidate images each of which reflects different local information. An image output is then obtained as a convex combination of candidates for each pixel based on estimated confidences of candidates. To reduce the time complexity of training and testing for KRR, a sparse solution is found by combining the ideas of kernel matching pursuit and gradient descent. As a regularized solution, KRR leads to a better generalization than simply storing the examples as it has been done in existing example-based super-resolution algorithms and results in much less noisy images. However, this may introduce blurring and ringing artifacts around major edges as sharp changes are penalized severely. A prior model of a generic image class which takes into account the discontinuity property of images is adopted to resolve this problem. Comparison with existing super-resolution and JPEG artifact removal methods shows the effectiveness of the proposed method. Furthermore, the proposed method is generic in that it has the potential to be applied to many other image enhancement applications.
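As a rough illustration of two ingredients named in the abstract above, the sketch below fits a kernel ridge regressor from input patches to output patches and then blends several per-pixel candidates with a convex combination. It is not the authors' implementation: the Gaussian kernel, the dense linear solve (the paper instead finds a sparse solution via kernel matching pursuit and gradient descent), and the function names gaussian_kernel, krr_fit, krr_predict and convex_combine are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def krr_fit(X, Y, lam, sigma):
    """Solve (K + lam * I) alpha = Y for the dual coefficients alpha.

    X holds flattened input patches (n, d); Y holds flattened output
    patches (n, p). This is the dense solution, assumed here for brevity.
    """
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Predict patch-valued outputs for new input patches."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def convex_combine(candidates, confidences):
    """Blend k candidate images (k, H, W) per pixel.

    Confidences must be positive; they are normalised so that the
    weights at each pixel are non-negative and sum to one.
    """
    weights = confidences / confidences.sum(axis=0, keepdims=True)
    return (weights * candidates).sum(axis=0)
```

In this sketch the regularisation strength lam plays the role the abstract attributes to KRR as a regularised solution: larger values smooth the prediction, which reduces noise but can also blur sharp edges.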
Active Learning Guided by Efficient Surrogate Learners
Re-training a deep learning model each time a single data point receives a
new label is impractical due to the inherent complexity of the training
process. Consequently, existing active learning (AL) algorithms tend to adopt a
batch-based approach where, during each AL iteration, a set of data points is
collectively chosen for annotation. However, this strategy frequently leads to
redundant sampling, ultimately eroding the efficacy of the labeling procedure.
In this paper, we introduce a new AL algorithm that harnesses the power of a
Gaussian process surrogate in conjunction with the neural network principal
learner. Our proposed model adeptly updates the surrogate learner for every new
data instance, enabling it to emulate and capitalize on the continuous learning
dynamics of the neural network without necessitating a complete retraining of
the principal model for each individual label. Experiments on four benchmark
datasets demonstrate that this approach yields significant enhancements, either
rivaling or aligning with the performance of state-of-the-art techniques
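The redundancy problem described in the abstract above can be illustrated with a toy Gaussian process. The sketch below is not the paper's algorithm: it uses no neural network, assumes an RBF kernel with unit prior variance, and selects points purely by posterior variance. The function names rbf, posterior_variance and select_sequentially are hypothetical. It only shows that updating a surrogate after every single selection steers the next pick away from near-duplicates of points already chosen.

```python
import numpy as np

def rbf(A, B, length_scale=1.0):
    """RBF kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def posterior_variance(X_pool, X_lab, noise=1e-3):
    """GP posterior variance at each pool point given labelled inputs.

    The variance depends only on input locations, not on label values.
    """
    prior = np.ones(len(X_pool))  # rbf(x, x) == 1 for every x
    if len(X_lab) == 0:
        return prior
    K = rbf(X_lab, X_lab) + noise * np.eye(len(X_lab))
    Ks = rbf(X_pool, X_lab)                  # shape (n_pool, n_lab)
    sol = np.linalg.solve(K, Ks.T)           # shape (n_lab, n_pool)
    return prior - np.einsum("ij,ji->i", Ks, sol)

def select_sequentially(X_pool, budget):
    """Pick `budget` pool indices one at a time, updating after each pick."""
    chosen = []
    for _ in range(budget):
        idx = np.array(chosen, dtype=int)
        var = posterior_variance(X_pool, X_pool[idx])
        var[idx] = -np.inf                   # never pick the same point twice
        chosen.append(int(np.argmax(var)))
    return chosen
```

With a pool made of two tight clusters, a budget of two yields one point from each cluster, whereas ranking all points once by their initial variance would give no reason to prefer a second cluster over a near-duplicate.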
Cyclohexane-1,2-diammonium bis(pyridine-2-carboxylate)
In the dication of the title salt, C6H16N2²⁺·2C6H4NO2⁻, the two ammonium groups are in the equatorial positions of the chair-shaped cyclohexyl ring. In the crystal, the cations and anions are linked by N—H⋯O and N—H⋯N hydrogen bonds, forming a layer network parallel to the ac plane. Weak π–π interactions between adjacent pyridine rings with a centroid–centroid distance of 3.589 (2) Å are also present.
Bis(2,2′-bipyridine-κ2 N,N′)dichloridoplatinum(IV) dichloride monohydrate
In the title complex, [PtCl2(C10H8N2)2]Cl2·H2O, the Pt4+ ion is six-coordinated in a distorted octahedral environment by four N atoms from the two 2,2′-bipyridine ligands and two Cl atoms. As a result of the different trans influences of the N and Cl atoms, the Pt—N bonds trans to the Cl atom are slightly longer than those trans to the N atom. The compound displays intermolecular hydrogen bonding between the water molecule and the Cl anions. There are intermolecular π–π interactions between adjacent pyridine rings, with a centroid–centroid distance of 3.962 Å.
BoIR: Box-Supervised Instance Representation for Multi-Person Pose Estimation
Single-stage multi-person human pose estimation (MPPE) methods have shown
great performance improvements, but existing methods fail to disentangle
features by individual instances under crowded scenes. In this paper, we
propose a bounding box-level instance representation learning called BoIR,
which simultaneously solves instance detection, instance disentanglement, and
instance-keypoint association problems. Our new instance embedding loss
provides a learning signal on the entire area of the image with bounding box
annotations, achieving globally consistent and disentangled instance
representation. Our method exploits multi-task learning of bottom-up keypoint
estimation, bounding box regression, and contrastive instance embedding
learning, without additional computational cost during inference. BoIR is
effective for crowded scenes, outperforming state-of-the-art on COCO val (0.8
AP), COCO test-dev (0.5 AP), CrowdPose (4.9 AP), and OCHuman (3.5 AP). Code
will be available at https://github.com/uyoung-jeong/BoIR
Comment: Accepted to BMVC 2023, 19 pages including the appendix, 6 figures, 7
tables
RGBD-Dog: Predicting Canine Pose from RGBD Sensors
The automatic extraction of animal 3D pose from images without markers is of interest in a range of scientific fields. Most work to date predicts animal pose from RGB images, based on 2D labelling of joint positions. However, due to the difficult nature of obtaining training data, no ground truth dataset of 3D animal motion is available to quantitatively evaluate these approaches. In addition, a lack of 3D animal pose data also makes it difficult to train 3D pose-prediction methods in a similar manner to the popular field of body-pose prediction. In our work, we focus on the problem of 3D canine pose estimation from RGBD images, recording a diverse range of dog breeds with several Microsoft Kinect v2s, simultaneously obtaining the 3D ground truth skeleton via a motion capture system. We generate a dataset of synthetic RGBD images from this data. A stacked hourglass network is trained to predict 3D joint locations, which is then constrained using prior models of shape and pose. We evaluate our model on both synthetic and real RGBD images and compare our results to previously published work fitting canine models to images. Finally, despite our training set consisting only of dog data, visual inspection implies that our network can produce good predictions for images of other quadrupeds – e.g. horses or cats – when their pose is similar to that contained in our training set